Experiences with using Data Cleaning Technology for Bing Services
Authors
Abstract
Over the past few years, our Data Management, Exploration and Mining (DMX) group at Microsoft Research has worked closely with the Bing team to address challenging data cleaning and approximate matching problems. In this article, we describe some of the key Big Data challenges in the context of Bing services, focusing primarily on two: Bing Maps and Bing Shopping. We describe ideas that proved crucial in meeting the quality, performance, and scalability goals demanded by these services. We also briefly reflect on the lessons learned and comment on opportunities for future work in data cleaning technology for Big Data.
Similar resources
Exploring the experiences of women with pelvic floor disorders with received medical services: a qualitative study
Background & Aim: Despite the high prevalence of the problem in society, people with pelvic floor disorders are less likely to seek medical services. Various studies have stated that one of the important problems in providing medical services to people with pelvic floor disorders is a lack of attention to their preferences and expectations. Therefore, this study was conducted with the aim of exp...
A Review of the Application of Plasma Technology in the Conservation of Cultural and Historical Artifacts
Nowadays, with growing awareness of the destructive effects of chemical and toxic substances on objects, the environment, and users, replacing these harmful materials, or at least minimizing their use, in the treatment and protection of valuable and rare objects is a priority. Researchers throughout the world are therefore seeking to develop and use safe and standard methods in th...
The Readiness of Hospitals to Implement the RFID Technology
Introduction: This study aims to systematically collect and review research on the implementation of Radio-frequency identification (RFID) technology. Methods: This study examined the existing literature in databases such as Google Scholar, ISI Web of Knowledge and Science Direct by using ...
Towards a Domain Independent Platform for Data Cleaning
We present a domain independent platform for data cleaning developed as part of the Data Cleaning project at Microsoft Research. Our platform consists of a set of core primitives and design tools that allow a programmer to develop sophisticated data cleaning solutions with minimal programming effort. Our primitives are designed to allow rich domain and application specific customizations and ca...
Virtual Services in Data Grids
Data grids enable next generation scientific explorations that require intensive computation and analysis of petabyte-scale shared data collections. Apart from the challenge of creating and managing the data, another major challenge is the discovery of derived data products that have already been created. This work addresses the latter challenge in minimizing response time and conserving th...
Journal title:
- IEEE Data Eng. Bull.
Volume 35, Issue
Pages -
Publication date 2012